AITopics | ap 50

With the rapidly increasing demand for oriented object detection, e.g. in autonomous driving and remote sensing, the recently proposed paradigm involving

artificial intelligence, detection, machine learning, (14 more...)

Neural Information Processing Systems

Country:

Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Jiangsu Province (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

b9009beb804fa097c04d226a8ba5102e-Supplemental.pdf

Neural Information Processing Systems | Feb-10-2026, 21:22:38 GMT

pascal voc benchmark, benchmark, parameterized ap loss, (12 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.54)

4f16c818875d9fcb6867c7bdc89be7eb-Supplemental.pdf

Neural Information Processing Systems | Feb-8-2026, 15:15:13 GMT

ap 50, prediction, uncertainty-aware soft target, (13 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.43)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.43)

Appendix A Theoretical Derivation of P-VAE

Neural Information Processing Systems | Feb-7-2026, 13:05:13 GMT

For both GP-VAE and CP-VAE, the number of attention heads is empirically set to 4. We customize

ap 25, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Prototypical Variational Autoencoder for Few-shot 3D Point Cloud Object Detection Weiliang Tang

Neural Information Processing Systems | Feb-7-2026, 13:05:09 GMT

Few-Shot 3D Point Cloud Object Detection (FS3D) is a challenging task, aiming to detect 3D objects of novel classes using only limited annotated samples for training.

artificial intelligence, machine learning, prototype, (13 more...)

Neural Information Processing Systems

Country:

Asia > China > Hong Kong (0.05)
Asia > China > Guangdong Province > Shenzhen (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Dual-Stream Spectral Decoupling Distillation for Remote Sensing Object Detection

Gao, Xiangyi, Zhao, Danpei, Yuan, Bo, Li, Wentao

arXiv.org Artificial Intelligence | Dec-5-2025

Knowledge distillation is an effective and hardware-friendly method that plays a key role in making remote sensing object detectors lightweight. However, existing distillation methods often encounter the issue of mixed features in remote sensing images (RSIs) and neglect the discrepancies caused by subtle feature variations, leading to entangled knowledge confusion. To address these challenges, we propose an architecture-agnostic distillation method named Dual-Stream Spectral Decoupling Distillation (DS2D2) for universal remote sensing object detection tasks. Specifically, DS2D2 integrates explicit and implicit distillation grounded in spectral decomposition. Firstly, the first-order wavelet transform is applied for spectral decomposition to preserve the critical spatial characteristics of RSIs. Leveraging this spatial preservation, a Density-Independent Scale Weight (DISW) is designed to address the challenges of dense and small object detection common in RSIs. Secondly, we show that implicit knowledge is hidden in subtle student-teacher feature discrepancies, which significantly influence predictions when activated by detection heads. This implicit knowledge is extracted via full-frequency and high-frequency amplifiers, which map feature differences to prediction deviations. Extensive experiments on the DIOR and DOTA datasets validate the effectiveness of the proposed method. Specifically, on the DIOR dataset, DS2D2 achieves improvements of 4.2% in AP50 for RetinaNet and 3.8% in AP50 for Faster R-CNN, outperforming existing distillation approaches. The source code will be available at https://github.com/PolarAid/DS2D2.
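
As a rough illustration of what a first-order (single-level) wavelet decomposition does to a feature map, the sketch below splits a 2D array into one low-frequency sub-band and three detail sub-bands, each at half resolution. It assumes the Haar wavelet, which the abstract does not specify, and the function name `haar_dwt2` is hypothetical; it is not taken from the DS2D2 code.

```python
import math

def haar_dwt2(image):
    """Single-level 2D Haar transform of a list of rows with even height and width.

    Assumption: Haar is used only because it is the simplest wavelet;
    the abstract does not say which wavelet the authors use.
    """
    s = math.sqrt(2.0)
    # Step 1: pairwise sums (low-pass) and differences (high-pass) along each row.
    lo = [[(r[j] + r[j + 1]) / s for j in range(0, len(r), 2)] for r in image]
    hi = [[(r[j] - r[j + 1]) / s for j in range(0, len(r), 2)] for r in image]

    def along_columns(band, sign):
        # Step 2: combine vertically adjacent rows with a sum (+1) or difference (-1).
        return [[(band[i][j] + sign * band[i + 1][j]) / s for j in range(len(band[0]))]
                for i in range(0, len(band), 2)]

    ll = along_columns(lo, +1)  # low-frequency approximation
    lh = along_columns(lo, -1)  # detail sub-band
    hl = along_columns(hi, +1)  # detail sub-band
    hh = along_columns(hi, -1)  # detail sub-band
    return ll, lh, hl, hh

ll, lh, hl, hh = haar_dwt2([[1.0] * 4 for _ in range(4)])
print(len(ll), len(ll[0]))  # each sub-band is half the input size per axis
```

Each sub-band is still a spatial grid, only at half resolution, which is one way to read the abstract's point that the decomposition keeps spatial layout available to later weighting steps.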

artificial intelligence, distillation, machine learning, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TGRS.2025.3600098

2512.04413

Genre: Research Report (1.00)

Industry:

Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (1.00)
Education (0.94)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

FMC-DETR: Frequency-Decoupled Multi-Domain Coordination for Aerial-View Object Detection

Liang, Ben, Liu, Yuan, Qiu, Bingwen, Wang, Yihong, Sui, Xiubao, Chen, Qian

arXiv.org Artificial Intelligence | Sep-30-2025

Aerial-view object detection is a critical technology for real-world applications such as natural resource monitoring, traffic management, and UAV-based search and rescue. Detecting tiny objects in high-resolution aerial imagery presents a long-standing challenge due to their limited visual cues and the difficulty of modeling global context in complex scenes. Existing methods are often hampered by delayed contextual fusion and inadequate non-linear modeling, failing to effectively use global information to refine shallow features and thus encountering a performance bottleneck. To address these challenges, we propose FMC-DETR, a novel framework with frequency-decoupled fusion for aerial-view object detection. First, we introduce the Wavelet Kolmogorov-Arnold Transformer (WeKat) backbone, which applies cascaded wavelet transforms to enhance global low-frequency context perception in shallow features while preserving fine-grained details, and employs Kolmogorov-Arnold networks to achieve adaptive non-linear modeling of multi-scale dependencies. Next, a lightweight Cross-stage Partial Fusion (CPF) module reduces redundancy and improves multi-scale feature interaction. Finally, we introduce the Multi-Domain Feature Coordination (MDFC) module, which unifies spatial, frequency, and structural priors to balance detail preservation and global enhancement. Extensive experiments on benchmark aerial-view datasets demonstrate that FMC-DETR achieves state-of-the-art performance with fewer parameters. On the challenging VisDrone dataset, our model achieves improvements of 6.5% AP and 8.2% AP50 over the baseline, highlighting its effectiveness in tiny object detection. The code can be accessed at https://github.com/bloomingvision/FMC-DETR.
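
The AP50 figure quoted here is average precision computed with an intersection-over-union (IoU) threshold of 0.5 for deciding whether a predicted box matches a ground-truth box. A minimal sketch of that matching test follows. The function names are hypothetical, boxes are assumed to be axis-aligned `(x1, y1, x2, y2)` tuples, and the `>=` comparison is an assumption, since benchmarks differ on whether the inequality is strict.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def matches_at_50(pred, gt, threshold=0.5):
    """True if the prediction counts as a hit under the AP50 criterion (assumed >=)."""
    return iou(pred, gt) >= threshold

print(matches_at_50((0, 0, 4, 4), (0, 0, 4, 3)))  # overlap of 12 over union of 16
print(matches_at_50((0, 0, 2, 2), (1, 0, 3, 2)))  # overlap of 2 over union of 6
```

For tiny objects a shift of a few pixels can change IoU substantially, which is one reason AP50 and the stricter averaged AP can move by different amounts.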

artificial intelligence, detection, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2509.23056

Country: Asia > China (0.14)

Genre: Research Report (1.00)

Industry: Transportation (0.88)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Clustering-based Feature Representation Learning for Oracle Bone Inscriptions Detection

Tao, Ye, Fu, Xinran, Pang, Honglin, Yang, Xi, Li, Chuntao

arXiv.org Artificial Intelligence | Aug-27-2025

Oracle Bone Inscriptions (OBIs) play a crucial role in understanding ancient Chinese civilization. The automated detection of OBIs from rubbing images represents a fundamental yet challenging task in digital archaeology, primarily due to various degradation factors including noise and cracks that limit the effectiveness of conventional detection networks. To address these challenges, we propose a novel clustering-based feature space representation learning method. Our approach uniquely leverages the Oracle Bones Character (OBC) font library dataset as prior knowledge to enhance feature extraction in the detection network through clustering-based representation learning. The method incorporates a specialized loss function derived from clustering results to optimize feature representation, which is then integrated into the total network loss. We validate the effectiveness of our method by conducting experiments on two OBI detection datasets using three mainstream detection frameworks: Faster R-CNN, DETR, and Sparse R-CNN. Through extensive experimentation, all frameworks demonstrate significant performance improvements.
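
The abstract does not give the form of the clustering-derived loss, so the following is only a generic sketch of how such a term could be added to a detection loss. It assumes a simple "pull each feature toward its assigned cluster centroid" penalty; the function names, the squared-distance form, and the `weight` value are all hypothetical and should not be read as the authors' formulation.

```python
def cluster_pull_loss(features, assignments, centroids):
    """Mean squared distance between each feature vector and its assigned centroid.

    Assumption: a centroid-distance penalty stands in for the paper's
    unspecified clustering-based loss.
    """
    total = 0.0
    for feature, k in zip(features, assignments):
        centroid = centroids[k]
        total += sum((f - c) ** 2 for f, c in zip(feature, centroid))
    return total / len(features)

def total_loss(detection_loss, cluster_loss, weight=0.1):
    """Weighted sum of the detector's own loss and the auxiliary clustering term."""
    return detection_loss + weight * cluster_loss

features = [[0.0, 0.0], [2.0, 0.0]]
centroids = [[0.0, 0.0], [0.0, 0.0]]
aux = cluster_pull_loss(features, [0, 1], centroids)
print(total_loss(1.0, aux, weight=0.5))
```

In a real training loop the centroids would come from clustering a reference feature set and the weight would be tuned; both choices are outside what the abstract states.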

artificial intelligence, knowledge, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2508.18641

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.94)

076a93fd42aa85f5ccee921a01d77dd5-Paper-Conference.pdf

H2RBox-v2: Incorporating Symmetry for Boosting Horizontal Box Supervised Oriented Object Detection Yi Yu, Xue Yang